Goto

Collaborating Authors

 yolov5 model


Doppler-Enhanced Deep Learning: Improving Thyroid Nodule Segmentation with YOLOv5 Instance Segmentation

Hussieni, Mahmoud El

arXiv.org Artificial Intelligence

The increasing prevalence of thyroid cancer globally has led to the development of various computer-aided detection methods. Accurate segmentation of thyroid nodules is a critical first step in the development of AI-assisted clinical decision support systems. This study focuses on instance segmentation of thyroid nodules using YOLOv5 algorithms on ultrasound images. We evaluated multiple YOLOv5 variants (Nano, Small, Medium, Large, and XLarge) across two dataset versions, with and without doppler images. The YOLOv5-Large algorithm achieved the highest performance with a dice score of 91\% and mAP of 0.87 on the dataset including doppler images. Notably, our results demonstrate that doppler images, typically excluded by physicians, can significantly improve segmentation performance. The YOLOv5-Small model achieved 79\% dice score when doppler images were excluded, while including them improved performance across all model variants. These findings suggest that instance segmentation with YOLOv5 provides an effective real-time approach for thyroid nodule detection, with potential clinical applications in automated diagnostic systems.


Improved YOLOv5s model for key components detection of power transmission lines

Chen, Chen, Yuan, Guowu, Zhou, Hao, Ma, Yi

arXiv.org Artificial Intelligence

High-voltage transmission lines are located far from the road, resulting in inconvenient inspection work and rising maintenance costs. Intelligent inspection of power transmission lines has become increasingly important. However, subsequent intelligent inspection relies on accurately detecting various key components. Due to the low detection accuracy of key components in transmission line image inspection, this paper proposed an improved object detection model based on the YOLOv5s (You Only Look Once Version 5 Small) model to improve the detection accuracy of key components of transmission lines. According to the characteristics of the power grid inspection image, we first modify the distance measurement in the k-means clustering to improve the anchor matching of the YOLOv5s model. Then, we add the convolutional block attention module (CBAM) attention mechanism to the backbone network to improve accuracy. Finally, we apply the focal loss function to reduce the impact of class imbalance. Our improved method's mAP (mean average precision) reached 98.1%, the precision reached 97.5%, the recall reached 94.4%, and the detection rate reached 84.8 FPS (frames per second). The experimental results show that our improved model improves detection accuracy and has performance advantages over other models.


Enhancing Bidirectional Sign Language Communication: Integrating YOLOv8 and NLP for Real-Time Gesture Recognition & Translation

Bhuiyan, Hasnat Jamil, Mozumder, Mubtasim Fuad, Khan, Md. Rabiul Islam, Ahmed, Md. Sabbir, Nahim, Nabuat Zaman

arXiv.org Artificial Intelligence

The primary concern of this research is to take American Sign Language (ASL) data through real time camera footage and be able to convert the data and information into text. Adding to that, we are also putting focus on creating a framework that can also convert text into sign language in real time which can help us break the language barrier for the people who are in need. In this work, for recognising American Sign Language (ASL), we have used the You Only Look Once(YOLO) model and Convolutional Neural Network (CNN) model. YOLO model is run in real time and automatically extracts discriminative spatial-temporal characteristics from the raw video stream without the need for any prior knowledge, eliminating design flaws. The CNN model here is also run in real time for sign language detection. We have introduced a novel method for converting text based input to sign language by making a framework that will take a sentence as input, identify keywords from that sentence and then show a video where sign language is performed with respect to the sentence given as input in real time. To the best of our knowledge, this is a rare study to demonstrate bidirectional sign language communication in real time in the American Sign Language (ASL).


Target Detection of Safety Protective Gear Using the Improved YOLOv5

Liu, Hao, Qin, Xue

arXiv.org Artificial Intelligence

In high-risk railway construction, personal protective equipment monitoring is critical but challenging due to small and frequently obstructed targets. We propose YOLO-EA, an innovative model that enhances safety measure detection by integrating ECA into its backbone's convolutional layers, improving discernment of minuscule objects like hardhats. YOLO-EA further refines target recognition under occlusion by replacing GIoU with EIoU loss. YOLO-EA's effectiveness was empirically substantiated using a dataset derived from real-world railway construction site surveillance footage. It outperforms YOLOv5, achieving 98.9% precision and 94.7% recall, up 2.5% and 0.5% respectively, while maintaining real-time performance at 70.774 fps. This highly efficient and precise YOLO-EA holds great promise for practical application in intricate construction scenarios, enforcing stringent safety compliance during complex railway construction projects.


Precision and Adaptability of YOLOv5 and YOLOv8 in Dynamic Robotic Environments

Kich, Victor A., Muttaqien, Muhammad A., Toyama, Junya, Miyoshi, Ryutaro, Ida, Yosuke, Ohya, Akihisa, Date, Hisashi

arXiv.org Artificial Intelligence

Recent advancements in real-time object detection frameworks have spurred extensive research into their application in robotic systems. This study provides a comparative analysis of YOLOv5 and YOLOv8 models, challenging the prevailing assumption of the latter's superiority in performance metrics. Contrary to initial expectations, YOLOv5 models demonstrated comparable, and in some cases superior, precision in object detection tasks. Our analysis delves into the underlying factors contributing to these findings, examining aspects such as model architecture complexity, training dataset variances, and real-world applicability. Through rigorous testing and an ablation study, we present a nuanced understanding of each model's capabilities, offering insights into the selection and optimization of object detection frameworks for robotic applications. Implications of this research extend to the design of more efficient and contextually adaptive systems, emphasizing the necessity for a holistic approach to evaluating model performance.


Next-gen traffic surveillance: AI-assisted mobile traffic violation detection system

Dede, Dila, Sarsıl, Mehmet Ali, Shaker, Ata, Altıntaş, Olgu, Ergen, Onur

arXiv.org Artificial Intelligence

Road traffic accidents pose a significant global public health concern, leading to injuries, fatalities, and vehicle damage. Approximately 1,3 million people lose their lives daily due to traffic accidents [World Health Organization, 2022]. Addressing this issue requires accurate traffic law violation detection systems to ensure adherence to regulations. The integration of Artificial Intelligence algorithms, leveraging machine learning and computer vision, has facilitated the development of precise traffic rule enforcement. This paper illustrates how computer vision and machine learning enable the creation of robust algorithms for detecting various traffic violations. Our model, capable of identifying six common traffic infractions, detects red light violations, illegal use of breakdown lanes, violations of vehicle following distance, breaches of marked crosswalk laws, illegal parking, and parking on marked crosswalks. Utilizing online traffic footage and a self-mounted on-dash camera, we apply the YOLOv5 algorithm's detection module to identify traffic agents such as cars, pedestrians, and traffic signs, and the strongSORT algorithm for continuous interframe tracking. Six discrete algorithms analyze agents' behavior and trajectory to detect violations. Subsequently, an Identification Module extracts vehicle ID information, such as the license plate, to generate violation notices sent to relevant authorities.


Improvement and Enhancement of YOLOv5 Small Target Recognition Based on Multi-module Optimization

Li, Qingyang, Li, Yuchen, Duan, Hongyi, Kang, JiaLiang, Zhang, Jianan, Gan, Xueqian, Xu, Ruotong

arXiv.org Artificial Intelligence

In this paper, the limitations of YOLOv5s model on small target detection task are deeply studied and improved. The performance of the model is successfully enhanced by introducing GhostNet-based convolutional module, RepGFPN-based Neck module optimization, CA and Transformer's attention mechanism, and loss function improvement using NWD. The experimental results validate the positive impact of these improvement strategies on model precision, recall and mAP. In particular, the improved model shows significant superiority in dealing with complex backgrounds and tiny targets in real-world application tests. This study provides an effective optimization strategy for the YOLOv5s model on small target detection, and lays a solid foundation for future related research and applications.


Federated Ensemble YOLOv5 -- A Better Generalized Object Detection Algorithm

Hegiste, Vinit, Legler, Tatjana, Ruskowski, Martin

arXiv.org Artificial Intelligence

Federated learning (FL) has gained significant traction as a privacy-preserving algorithm, but the underlying resemblances of federated learning algorithms like Federated averaging (FedAvg) or Federated SGD (Fed SGD) to ensemble learning algorithms have not been fully explored. The purpose of this paper is to examine the application of FL to object detection as a method to enhance generalizability, and to compare its performance against a centralized training approach for an object detection algorithm. Specifically, we investigate the performance of a YOLOv5 model trained using FL across multiple clients and employ a random sampling strategy without replacement, so each client holds a portion of the same dataset used for centralized training. Our experimental results showcase the superior efficiency of the FL object detector's global model in generating accurate bounding boxes for unseen objects, with the test set being a mixture of objects from two distinct clients not represented in the training dataset. These findings suggest that FL can be viewed from an ensemble algorithm perspective, akin to a synergistic blend of Bagging and Boosting techniques. As a result, FL can be seen not only as a method to enhance privacy, but also as a method to enhance the performance of a machine learning model.


Motion Informed Object Detection of Small Insects in Time-lapse Camera Recordings

Bjerge, Kim, Frigaard, Carsten Eie, Karstoft, Henrik

arXiv.org Artificial Intelligence

Insects as pollinators play a crucial role in ecosystem management and world food production. However, insect populations are declining, calling for efficient methods of insect monitoring. Existing methods analyze video or time-lapse images of insects in nature, but the analysis is challenging since insects are small objects in complex and dynamic scenes of natural vegetation. In this work, we provide a dataset of primary honeybees visiting three different plant species during two months of the summer period. The dataset consists of 107,387 annotated time-lapse images from multiple cameras, including 9,423 annotated insects. We present a method pipeline for detecting insects in time-lapse RGB images. The pipeline consists of a two-step process. Firstly, the time-lapse RGB images are preprocessed to enhance insects in the images. This Motion-Informed-Enhancement technique uses motion and colors to enhance insects in images. Secondly, the enhanced images are subsequently fed into a Convolutional Neural network (CNN) object detector. The method improves the deep learning object detectors You Only Look Once (YOLO) and Faster Region-based CNN (Faster R-CNN). Using Motion-Informed-Enhancement, the YOLO-detector improves the average micro F1-score from 0.49 to 0.71, and the Faster R-CNN-detector improves the average micro F1-score from 0.32 to 0.56 on the dataset. Our dataset and proposed method provide a step forward to automate the time-lapse camera monitoring of flying insects. The dataset is published on: https://vision.eng.au.dk/mie/


Leveraging object detection for the identification of lung cancer

Gunasekaran, Karthick Prasad

arXiv.org Artificial Intelligence

Lung cancer poses a significant global public health challenge, emphasizing the importance of early detection for improved patient outcomes. Recent advancements in deep learning algorithms have shown promising results in medical image analysis. This study aims to explore the application of object detection particularly YOLOv5, an advanced object identification system, in medical imaging for lung cancer identification. To train and evaluate the algorithm, a dataset comprising chest X-rays and corresponding annotations was obtained from Kaggle. The YOLOv5 model was employed to train an algorithm capable of detecting cancerous lung lesions. The training process involved optimizing hyperparameters and utilizing augmentation techniques to enhance the model's performance. The trained YOLOv5 model exhibited exceptional proficiency in identifying lung cancer lesions, displaying high accuracy and recall rates. It successfully pinpointed malignant areas in chest radiographs, as validated by a separate test set where it outperformed previous techniques. Additionally, the YOLOv5 model demonstrated computational efficiency, enabling real-time detection and making it suitable for integration into clinical procedures. This proposed approach holds promise in assisting radiologists in the early discovery and diagnosis of lung cancer, ultimately leading to prompt treatment and improved patient outcomes.